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Abstract 

In computer simulations performed in constant number of particles en- 
sembles, although the total number of particles N contained in the simulation 
box does not fluctuate, hence giving a zero apparent compressibility, there 
are still local fluctuations in the number of particles. It is shown herein that 
these apparent fluctuations produce a compressibility that can be computed 
from the calculated radial distribution function, and which matches to a 
great accuracy the compressibility of the fluid for the open system. 

This statement implies that the radial distribution function evaluated 
in simulation of constant number of particles is identical to that evaluated 
in the grand canonical ensemble, for the entire distance range within half- 
box width. This is illustrated for the hard sphere and Lennard- Jones fluids 
and for molecular models of water. The origin of this apparent fluctuation 
is that the bulk of the remaining particles, outside the range over which 
the distribution function is calculated, act as a reservoir of particles for 
those within this range, thanks to the periodic boundary conditions. The 
implications on the calculation of the Kirkwood-Buff integrals are discussed. 
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1 Introduction 

The statistical theory of liquids is now mostly textbook knowledge. Within such 
a theory one shows that the isothermal compressibility of a liquid is related to the 
fluctuation in number of particles of the system, through a fluctuation-dissipation 
type relation: 



where Xt = P/P * s the ideal gas compressibility, (3 = l/k B T is the Boltzmann 
factor and p = N/V the number density. The first equality in (ED) is a thermody- 
namic definition, while the two last equalities are derived in the grand canonical 
ensemblejTJ, where the fluctuation in the number of particle is allowed. G in this 
equation is the integral of the radial distribution function (RDF) g(r) 



From Eq.fpQ) it is seen that the compressibility is related to the fluctuation in 
number of particles through the second equality. Therefore, we do not expect these 
relations to hold in any constant N ensemble such as the micro-canonical (constant 
N,V,E), canonical (constant N,V,T) or isobaric (constant N,P,T) ensembles. In 
such ensembles, the last equality is expected to give 1 + pG = 0. Many textbooks 
show that this is indeed the case, when G is evaluated in N-constant ensemble. 
One such demonstration is provided in the next section of this report. Since the 
RDF can be evaluated by simulations in different ensembles, one expects the above 
constraint to be verified in N-constant ensembles. We show here that this is not 
the case, and that, in fact, the effective compressibility evaluated through Eq.(pQ) 
using the RDF, as well as the RDF itself, evaluated in the canonical ensemble, for 
example, match the expected ones with great accuracy. This contradiction is lifted 
by considering contributions from the periodic boundary conditions, as discussed 
later. In fact, this result should not surprise us, since it is well known that the 
chemical potential can be evaluated from simulations in N-constant ensemble [2]. 
The procedure consist in evaluating the insertion free energy of an additional 
particle, which implicitly supports the existence of local density fluctuations. It is 
these local fluctuations that give rise to the apparent global density fluctuation, 
in the very absence of any such macroscopic feature. 

Such behaviour is observed equally in mixtures, where, in addition to density 
fluctuation, concentration fluctuation also play an important role. Hence, it gives 
some additional support for the evaluation, through computer simulations, of the 
so-called Kirkwood-Buff integrals (KBIs)[3|, that have attracted recent interest in 
the modeling of the force field of aqueous mixturespj [5]. Indeed, such mixtures 
tend to show appreciable micro-segregation, that can be detected to some extent 
through the comparison with the experimental KBIs|4], [6], [7]- Since these quanti- 
ties are simply the integrals of the various site-site radial distribution functions, 
the evaluation of the latter by computer simulations has an undeniable interest. 
However, the theory behind the KBIs is strictly a grand canonical approach [3]. 
Then, one is interested to know what are the limitations of the RDF computed in 
constant-N ensemble simulations 

It is worthwhile mentioning that, the problematic of comparing the RDF eval- 
uated in different ensembles, have been addressed in the past by few authorsjH [9]. 




(1) 




(2) 



2 



To our knowledge, however, the fact that the isothermal compressibility and the 
RDF match that evaluated from grand canonical ensemble has not been addressed 
in present terms. 

The paper is organized as follows; the next section presents the theoretical 
material that is needed to clarify the issue raised here . The results section contains 
an illustration on the hard sphere, Lennard- Jones and some water models. Finally 
the discussions and conclusions are given in section 4. 

2 Theoretical details 

As stated in the introduction, there is a connection between the fluctuation of the 
number of particles N and the isothermal compressibility Xt, which is found in 
many textbooks of the liquid state theory [T]. We recall here briefly the main steps 
for completeness sake. 

The RDF is a key quantity in the analysis of liquids since it provides a direct 
information about the microstructure of liquids. This function is related to the 
second member of the whole hierarchy of density correlation functions, that can 
be constructed as statistical moments from one single function, the microscopic 
density in a N-particles system 

N 

rti)=l>(i-i) (3) 

i=0 

with S(i) being the Dirac delta function and with the convention that 1 = (ri,^) 
represents the position and the orientation of molecule labeled 1. This definition 
is equivalent to a snapshot of the instantaneous position and orientation of the 
particles, and thus represents one microstate of the system. It is interesting to 
note that this quantity is the basic observable that can be perused by computer 
simulations. In practice, however, we are essentially interested in averages of 
this quantity and its various correlations. These averages can be performed in 
different ensembles. For a given ensemble, one can define, for example, the one- 
body function p^(l) =< p(l) > which for the translationally and rotationally 
invariant systems (homogeneous and isotropic) that we consider in this report is 
simply 

pW{l)=<p(l)>=p = p/u (4) 

that is the number density p = N/V divided by the solid angle u (where to = 4n 
or 8tt 2 depending on the symmetry of the molecule). 

Going one step further, one can equally define the second moment, namely the 
two particles density as 

2) =< p(l)p(2) - 5(1, 2)p(l) >=< J2 ^ - 1W " 2 ) > (5) 

The pair correlation function g(l,2) is then defined by 

(1,2) =^(1,2) (6) 

This definition implies that when the particles are uncorrelated, that is for example 
when they are infinitely far apart, then one has exactly p( 2 '(l,2) = p 2 , which 
means: 

lim 0(1,2) = 1 (7) 
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It is clear that this relationship is trivially valid for an infinite system, but must be 
revised for a finite system with N particles in a volume V. This is, for example, the 
case of the micro-canonical (constant NVE), canonical (constant NVT) or isobaric 
(constant NPT) statistical ensembles, in which most simulations are performed. 
In the following discussion, we take as example the canonical ensemble, but the 
reasoning equally holds for the other two ensembles. 

For the translationally and rotationally invariant systems considered in this 
report, equation ((6|) together with (J3]) leads to 

(7(1,2) = N{N ~ l) [ d3...dNexp(-(3V3(N)) (8) 

P J 

where = J dl...dN exp(— /3QJ(JV)) is the canonical ensemble partition function, 
with QJ(iV) being the total interaction energy between the N particles. The RDF 
g(r) is then defined as the angle average of g(l, 2) as: 

9{r) = ^ J dMks(l,2) (9) 

It is perhaps worthwhile noting here that the definition (jHJ) contains the factor 
N(N — 1)/N 2 , where the numerator arises from the number of pairs in an ensemble 
of N particles, while the denominator is coming from the factor p 2 in (J6j). It is this 
factor that is responsible for the asymptotic limit of the RDF in finite systems. 

Let us examine the asymptotic behaviour from Eqs. (l8|9l) when r — » oo. From 
Eq.jEJ), when r 12 — > oo , that is when the two particles are far apart, one can neglect 
the interaction between them in 23(iV), hence one gets directly from Eqs. (l8|9l . for 
the canonical ensemble 

lim g(r) = 1 - 1/JV (10) 

) — »oo 

This relation hold equally in the micro-canonical ensemble. In the isobaric, con- 
stant NPT ensemble, the corresponding limit is g(r) — > (1 — 1/N)X(V), where A(V) 
is a formal function whose form is given in the appendix. This size dependence is 
expected to be observed in the canonical ensemble, and one can see from Eq.(pQ) 
that the integral G should trivially satisfy 

l + p G = (11) 

since N does not fluctuate. 

The RDF can also be directly obtained from computer simulations from Eq.j5]). 
This equation tells us that we only need to compute the histogram H(r, Ar) which 
counts the pairs of particles at a distance between r and r + Ar. The resulting 
expression is: 

H(r,Ar) , , 

9 « = fPAndr) <12) 

where AV(r, Ar) = (47rr 2 Ar)/Vo is the volume of the spherical element consid- 
ered, reduced by the total volume V of the simulation cell. The above expression 
is the equivalent of the second equality in Eq.([5|). In practice, this calculation is 
performed and averaged over several configurations or microstates. It is very im- 
portant to note that Eq.(fT2il is valid for any statistical ensemble. It is the nature 
of fluctuations of a particular ensemble that will affect the form the histogram 
H(r, Ar) and determine the corresponding asymptotic behaviour of g(r). Hence, 
it is not appropriate to introduce additional factors into this expression, in order 
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to enforce expected behaviour. This point is worth mentioning, since incorrect 
additional factors are often found in some algorithms. 

An important remark must be made at this point. The computer simulations of 
the various N constant ensembles are not strictly concerned by arguments about 
finite or closed systems, since the periodic boundary conditions allow to mimic 
an infinite system. This is, of course, size dependent. It turns out that this fact 
permits some kind of pseudo-fluctuations in the number of particles within specific 
subvolumes, that we consider to be local fluctuations, while the total number of 
particles in the simulation cell does not fluctuate. In order to grasp what this 
feature could induce, let us imagine a very large system, that is constrained to 
be finite (a fluid in a huge box). Then, we consider computing the g(r) in a very 
small sub-system inside this large system. Clearly, the outer part of the system acts 
as a particle reservoir, and we naturally expect the resulting g(r) to contain the 
fluctuation of particles correctly described, that is the correct asymptotic limit, i.e. 
unity. Since the g(r) of small systems are computed within a sub cell of the system 
box, usually in a spherical shell with radius about L/2, where L is the simulation 
box size, we expect some influence on the g(r) of the fluctuations of the number 
of particles of this subsystem. This is what we examine in the next section. It 
should be clear, at this point, that our considerations about apparent fluctuations 
hold only for subvolumes smaller than that of the largest sphere inscribed within 
the cubical box, and that fluctuations beyond this range will inevitably show that 
the total number of particles is fixed. 

These definitions can be equally extended to mixtures. In particular, the 
Kirkwood-Buff integrals are defined between species labeled a and f3 as: 



where g a p{r) is the corresponding RDF. 

3 Results 

Nothing, in the general formalism outlined above, prepare for the actual results we 
have observed in our simulations. We have evaluated the RDF and the resulting 
compressibility (through (JT|)) for the hard sphere fluid and two water models. 
Constant NVT Monte Carlo (MC) simulations have been performed for the hard 
sphere and Lennard- Jones fluids, for various system sizes (N=500, 2048, 4000). 
These two models are the archetypal models for simple liquids, and thus serve 
as a perfect basis to test the claims. Molecular Dynamics simulations in the 
constant NPT ensemble have been performed for the SPC/E[T0] and TIP5P[TT] 
water models, with N=864, 2048 and N=10976. The calculations show in an 
unbiased manner that the RDF have the correct asymptotic limit, i.e. unity. 

3.1 The hard sphere fluid 

One component hard sphere (HS) fluid has been equilibrated using a constant 
NVT Monte Carlo code, for various system sizes, and for the fixed reduced number 
density p* = (N/V)a 3 = 0.8, where a is the hard sphere diameter. This density 
can be considered as dense enough for a liquid phase. The RDF for systems with 
N=500, 2048 and 10976 have been computed with statistics performed over 100 
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million particle moves. This exceeds by far the usually encountered numbers, but 
it is necessary in order to have accurate evaluation of the g(r) at large distances. 

Fig.l shows the RDFs in the entire range, and the inset shows the integrand 
y(r) = (g(r) — l)r 2 . The expected asymptotes —r 2 /N are also plotted, each for 
the range over which the RDFs have been evaluated. The inset indicates clearly 
that y(r) oscillates around zero, rather than the expected asymptotes. The RDF 
calculated from the Percus-Yevick (PY) theory is equally shown. Since this RDF is, 
by definition, calculated in the grand canonical ensemble, it serves as a test of the 
asymptotic limit, despite being an approximate theory. One notices in particular 
the very good agreement at large distances between the oscillatory structure from 
the PY theory and the simulations with N=10976, which is clearly seen in the 
inset. 

3.2 The Lennard-Jones fluid 

The one component Lennard-Jones (LJ) fluid has been also studied by an NVT 
constant MC algorithm, for reduced density p* = 0.9 and reduced temperature 
T* = k B T/e = 3.0 (where e is the energy depth of the LJ interaction). This choice 
corresponds to a point in the dense and high temperature state of the LJ fluid. 
Systems sizes of N=500, 2048 and 10976 have been studied, with statistics similar 
to that of the hard sphere fluid. 

Fig. 2 is the analogous of Fig.l but for the LJ fluid. Once again, it is observed 
that the RDF tends to unity, despite the fact that we are in an constant N ensemble. 
The PY results are also plotted and serve as a reference, despite the approximate 
nature of this theory. In the high temperature region, this theory is expected to 
be more appropriate than the hypernetted chain (HNC) theory, since in the high 
temperature regime the correlations are expected to be dominated by excluded 
volume effects, and be closer to the HS regimep]. It is observed that the long 
range oscillatory structures, between the simulations and the theory, are again in 
very good agreement. The large oscillations for the N= 10976 system are due to 
the high temperature, and are difficult to reduce, despite statistics collected over 
100 million moves. Again it is observed that none of the data follow the expected 
1/N asymptotes that are plotted in the inset. 

Fig. 3 shows the running reduced compressibility Xt = Xt/Xt f° r the same 
system. The numerical value obtained from the PY theory is Xt = 0.05254, shown 
as an horizontal line, and serves as a comparison point for the unknown exact 
compressibility for this state point. The agreement between the asymptotic limit 
evaluated through Eqs.( fTf2i ) indicates that the pseudo macroscopic density fluctu- 
ation in this system where neither N nor V are changing, tend to a non zero value, 
which is quite nicely seen for the N=10976 system. This value of the compressibil- 
ity, estimated to be about 0.035, is below both that of the approximate PY and 
HNC(0. 07564) theories, which tends to overestimate the correct compressibility. 
The virial pressures (compressibility factor Z = (3P/p) obtained from the HNC 
and PY theories are 6.398 and 4.90, respectively, while that obtained from the sim- 
ulations is 5.140. The reduced energies per particle are -1.167 (HNC), -1.453(PY) 
and -1.395 (MC). The simulated values are seen to be in between those of the two 
theories. This reflects the trend generally observed that the two theories bracket 
the exact results. 

As a side remark, it is rather worthy of note that the convergence of the value 
of the compressiblity should need systems sizes as large as N= 10976 for systems 
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as simple as the HS or LJ fluids. Properties such as the pressure or the internal 
energy, converge faster, that is within a smaller range of distance. The typical 
example of this is the pressure for hard spheres that requires only the contact 
value of the g(r). The principal reason for these properties to converge faster is 
that the integrand is weighted by the interaction (or the derivative), which screens 
the long range oscillatory behaviour of the RDF. 

3.3 Water models 

Molecular Dynamics simulations in constant NPT ensemble, using the DL-POLY2 
code [12], have been performed for two water models, namely the SPC/E and 
TIP5P models, under ambient conditions. The oxygen-oxygen RDF goo( r ) nave 
been computed for several system sizes, which is N=864, 2048 and 10976 for the 
SPC/E model, and N=2048 for the TIP5P model. Averages have been performed 
over several hundreds of thousand steps. 

Fig. 4 shows the RDF, and the inset shows the integrand (gooij) — l)r 2 . Again, 
the asymptotic limit is near perfectly unity, in contradiction with the fact that 
this calculation is done in the N,P,T constant ensemble, where the strict appli- 
cation of the arguments developed in Section 2, give the following expectation, 
lim r ^ 00 5f(r) = (1 — 1/N)\(V) (see the appendix). 

Fig. 5 shows the experimental and calculated KB integrals from simulations. 
The experimental compressibility at room temperature is 0.4533GPa _1 [l3]. From 
this value we estimate the experimental G to be G = — 16.9cm 3 /mol, which is 
shown as an horizontal line. The agreement is very good, indicating again that 
the true and apparent compressibilities are in good correspondence. One can note 
the fact that both water models reproduce well the experimental compressibility, 
which is an indication of their accuracy with that respect (TIP4P is equally in the 
same accuracy region, although not shown here). 

4 Discussion and conclusion 

The results shown in the preceding section demonstrate numerically that the ap- 
parent density fluctuations in N constant ensemble simulations match near per- 
fectly those of a corresponding system where N would be allowed to fluctuate. 
In addition, the RDF evaluated in such ensemble tend to the correct asymptotic 
behaviour, i.e. unity. These findings are purely empirical, and nothing in the the- 
oretical considerations of Section 2 allows us to anticipate or justify these findings. 

Examining the conditions of the realization of the various N-constant ensemble 
in computer simulations, we notice that these ensembles are not "closed", since 
they are made infinite through the periodic boundary conditions. In fact, a finite 
constant N system cannot be "closed" in the sense of having a fixed boundary, since 
this boundary will always influence the distribution of the particles by making the 
system inhomogeneous, and thus rendering useless the formalism exposed in the 
section 2. In this, the term "closed" used in Ref.[9|, and various subsequent reports 
by other authors, is totally inappropriate. The absence of boundaries permits an 
effective fluctuation of the number of particles inside any subvolume, and the 
numerical evidence here simply indicates that this fluctuation is enough to induce 
an apparent macroscopic fluctuation through the RDF, that is near exactly that of 
the open system. The fact that a small system of few hundreds to few thousands 
particles can reproduce a macroscopic quantity is not new; many macroscopic 
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thermodynamic quantities are evaluated within such small microscopic subsystems 
that are the simulation cells. In fact, as stated in the introduction, the very fact 
that one can use an N-constant cell to compute the chemical potential, indicates 
that the fluctuations in the system are the appropriate response to local variations 
in pressure and volume. 

One of the interesting conclusions of this study is the perspective of its appli- 
cation to mixtures, and particularly to aqueous mixtures. Indeed, in such systems, 
concentration fluctuations are very important, due to specific self-clustering ten- 
dencies induced by the ability of water molecules to link together through hydrogen 
bonds. One of the questions that have aroused in recent works [H [7] is the ability of 
the constant N simulations to reproduce properly the correct long range behaviour 
of the various RDFs, and in particular that between water molecules. The present 
study gives some confidence in the constant N simulation studies of such systems, 
once care is taken to consider sizes large enough to support local immiscibility 
without leading to macroscopic phase separation. 



Appendix 

The computation of the formal limit of g(r) in the NPT ensemble follows a logic 
similar to that exposed in Section 2. We start from the equivalent of (8) which is 
now 

g{l,2) = - I dVexp(-(3PV) I d3...dN exp(-pMN)) (14) 

P 2 Qn Jo J 

where Qn = f °° dV exp( — /3PV) f dl...dN exp(— f3%$(N)) is the partition function 
of the (N,P,T) constant isobaric ensemble|Tj. The number density p is now defined 
as p = N/V, where V is the effective average volume of the system under constant 
pressure P. When r 12 — > oo, that is when the interaction between particles 1 and 
2 can be neglected, Eq. ffTil) reduces to 

, x N(N-l) 1 , , 

'C'^V^y (15) 



where 



2 , f °°dVV 2 exp(-pPV)f(V) 



< V 2 > = JU " v - /J v 7 (16) 

^dVeM-pPV)f(V) 1 1 

with /(V) = J d3...dN exp(— /32J'(iV)). The ' indicates that the volume average is 
taken for (N-3) particles only. Eq. ffl~5l) can be rewriten as 

The second fraction involving the volumes can be evaluated easily in the ideal gas 
limit, but is not so for the case the fully interacting system. Nevertheless, one sees 
that the asymptotic limit of the isotropic part of g(l,2) is not trivially unity, as 
would be expected in an open system. 
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Figure captions 



1. Fig.l. {color online). RDF of the hard sphere fluid at p* = 0.8. MC 
simulation results for N=500 (green), N=2048 (blue) and N=10976 (red). 
PY result in black. Inset: plots of y(r) = (g(r) — l)r 2 with same color 
conventions. 

2. Fig. 2. (color online). RDF of the Lennard- Jones fluid at p* = 0.8 and 
T* = 0.9. MC simulation results for N=500 (green), N=2048 (blue) and 
N=10976 (red). HNC result in black. Inset: plots of y(r) = (g(r) - l)r 2 
with same color conventions. 

3. Fig. 3. (color online). Running (reduced) compressibility integral (see text) 
for the system shown in Fig. 2, with same color convention. The horizontal 
line is the corresponding HNC compressibility. 

4. Fig. 4. (color online). Water oxygen-oxygen RDF from MD simulations for 
pure water at ambient conditions. SPC/E with N=864 (magenta); SPC/E 
with N=2048 (blue); SPC/E with N=10976 (black); TIP5P with N=2048 
(cyan). The inset shows the corresponding y(r). 

5. Fig. 5. (color online). The running KB integral for the systems shown in 
Fig. 4, with same conventions. The horizontal line is the experimental result 
(see text). 
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